Here, we can see that there are a lot of outliers, so we can use the median to fill the null values in the following columns:
- AMOUNT_RUB_CLO_PRC
- AMOUNT_RUB_SUP_PRC
- AMOUNT_RUB_NAS_PRC
- AMOUNT_RUB_ATM_PRC

Here, we can use the mean to fill the null values in the following columns:
- TRANS_COUNT_SUP_PRC
- TRANS_COUNT_NAS_PRC
- TRANS_COUNT_ATM_PRC

Here, we can use the mean to fill the null values in the following columns:
- CNT_TRAN_ATM_TENDENCY3M
- SUM_TRAN_ATM_TENDENCY3M

Here, we can use the mean to fill the null values in the following columns:
- TRANS_AMOUNT_TENDENCY3M
- TRANS_CNT_TENDENCY3M
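The imputation plan above could be sketched roughly as follows. This is a minimal illustration assuming the data lives in a pandas DataFrame; the helper name `impute` and the toy values are hypothetical and not taken from the original notebook.

```python
import numpy as np
import pandas as pd

# Column groups exactly as listed in the notes above.
MEDIAN_COLS = [
    "AMOUNT_RUB_CLO_PRC", "AMOUNT_RUB_SUP_PRC",
    "AMOUNT_RUB_NAS_PRC", "AMOUNT_RUB_ATM_PRC",
]
MEAN_COLS = [
    "TRANS_COUNT_SUP_PRC", "TRANS_COUNT_NAS_PRC", "TRANS_COUNT_ATM_PRC",
    "CNT_TRAN_ATM_TENDENCY3M", "SUM_TRAN_ATM_TENDENCY3M",
    "TRANS_AMOUNT_TENDENCY3M", "TRANS_CNT_TENDENCY3M",
]


def impute(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: return a copy with nulls filled per column."""
    out = df.copy()
    # Median is robust to outliers, so use it for the outlier-heavy columns.
    for col in MEDIAN_COLS:
        if col in out.columns:
            out[col] = out[col].fillna(out[col].median())
    # Mean for the remaining columns.
    for col in MEAN_COLS:
        if col in out.columns:
            out[col] = out[col].fillna(out[col].mean())
    return out


# Toy data with made-up values, only to show the behaviour.
toy = pd.DataFrame({
    "AMOUNT_RUB_ATM_PRC": [0.1, 0.2, np.nan, 0.9],
    "TRANS_COUNT_ATM_PRC": [0.2, np.nan, 0.4, 0.6],
})
filled = impute(toy)
print(filled)
```

If the data is split into train and test sets, it is usually safer to compute the fill values on the training split only and reuse them on the test split, so that no test information leaks into preprocessing.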

Confusion Matrix (Training Set):
[[12340  7994]
 [ 5952 14382]]
Accuracy (Train): 0.6570768171535359
Precision (Train): 0.6427422238112264
Recall (Train): 0.7072882856299794
F1 Score (Train): 0.6734722547412784
ROC AUC Score (Train): 0.716468372977995

Confusion Matrix (Test Set):
[[59006 38960]
 [ 2534  6057]]
Accuracy (Test): 0.6105933913304616
Precision (Test): 0.13454917031343713
Recall (Test): 0.7050401583052032
F1 Score (Test): 0.2259737352633935
ROC AUC Score (Test): 0.7128585583248431
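As a sanity check, the accuracy, precision, recall, and F1 values above can be recomputed from the two confusion matrices. The sketch below assumes the scikit-learn layout `[[TN, FP], [FN, TP]]`; the function name `metrics_from_confusion` is hypothetical. ROC AUC is not included because it depends on predicted scores, not on the confusion matrix alone.

```python
def metrics_from_confusion(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Derive threshold metrics from confusion-matrix counts."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "positive_rate": (tp + fn) / total,  # class balance of the set
    }


# Counts copied from the confusion matrices reported above.
train = metrics_from_confusion(tn=12340, fp=7994, fn=5952, tp=14382)
test = metrics_from_confusion(tn=59006, fp=38960, fn=2534, tp=6057)

for name, m in (("train", train), ("test", test)):
    print(name, {k: round(v, 4) for k, v in m.items()})
```

The counts also show that the training set has equal class sizes (20334 each), while only about 8% of the test rows are positive. Recall is nearly identical on both sets, so the drop in test precision and F1 follows largely from this difference in class balance rather than from a change in how well positives are detected.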

We found that AGE_GROUP_Group 4, LDEAL_GRACE_DAYS_PCT_MED, PACK_301, TURNOVER_CC, etc. were the most effective local features.

For the global features, the top 10 features selected for the model are listed below along with their importance values:
- Feature: AGE_GROUP_Group 4, Importance: -0.0027253763333527697
- Feature: AMOUNT_RUB_ATM_PRC, Importance: -0.0029518696464307425
- Feature: AMOUNT_RUB_SUP_PRC, Importance: -0.0030826754407552153
- Feature: TRANS_COUNT_NAS_PRC, Importance: -0.0030865940822398944
- Feature: TURNOVER_DYNAMIC_CUR_1M, Importance: -0.0031280254084610206
- Feature: TURNOVER_DYNAMIC_CC_3M, Importance: -0.003149758587542671
- Feature: LDEAL_GRACE_DAYS_PCT_MED, Importance: -0.003168928654897231
- Feature: CNT_TRAN_ATM_TENDENCY3M, Importance: -0.0031717590710851422
- Feature: REST_DYNAMIC_PAYM_3M, Importance: -0.00317533773450771
- Feature: REST_DYNAMIC_CUR_1M, Importance: -0.003177373507806623